Evolving Better Multiple Sequence Alignments

Authors

  • Luke Sheneman
  • James A. Foster
Abstract

Aligning multiple DNA or protein sequences is a fundamental step in the analysis of phylogeny, homology, and molecular structure. Because optimal multiple sequence alignment is prohibitively expensive, heuristic algorithms are applied instead. These heuristics represent a practical trade-off between speed and accuracy, but they can be improved. We present EVALYN (EVolved ALYNments), a novel approach to multiple sequence alignment in which sequences are progressively aligned based on a guide tree optimized by a genetic algorithm. We hypothesize that a genetic algorithm can find better guide trees than traditional, deterministic clustering algorithms. We compare our evolutionary approach to CLUSTAL W and find that EVALYN performs consistently and significantly better as measured by a common alignment scoring technique. Additionally, we hypothesize that evolutionary guide tree optimization is inherently efficient and has lower time complexity than the commonly used neighbor-joining algorithm. We present an analysis in support of this scalability hypothesis.
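The idea of searching over guide trees can be illustrated with a small sketch. The code below is not EVALYN: it assumes a sum-of-pairs fitness with illustrative scores (match 1, mismatch -1, linear gap -2), represents a guide tree as nested tuples of sequence indices, and uses a mutation-only evolutionary loop with truncation selection in place of the paper's crossover and mutation operators. All function names and parameters are hypothetical.

```python
import random
from itertools import combinations

GAP = "-"

def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    # Illustrative scoring values (assumed, not taken from the paper).
    if a == GAP and b == GAP:
        return 0
    if a == GAP or b == GAP:
        return gap
    return match if a == b else mismatch

def sum_of_pairs(alignment):
    # Score every pair of rows in every column and add the results.
    return sum(pair_score(x, y)
               for col in zip(*alignment)
               for x, y in combinations(col, 2))

def align_profiles(p, q):
    # Needleman-Wunsch over columns of two existing alignments.
    cols_p, cols_q = list(zip(*p)), list(zip(*q))
    gap_p, gap_q = (GAP,) * len(p), (GAP,) * len(q)

    def s(cp, cq):
        return sum(pair_score(x, y) for x in cp for y in cq)

    n, m = len(cols_p), len(cols_q)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + s(cols_p[i - 1], gap_q)
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + s(gap_p, cols_q[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j - 1] + s(cols_p[i - 1], cols_q[j - 1]),
                           dp[i - 1][j] + s(cols_p[i - 1], gap_q),
                           dp[i][j - 1] + s(gap_p, cols_q[j - 1]))
    # Trace back from the bottom-right corner to build merged columns.
    merged, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + s(cols_p[i - 1], cols_q[j - 1]):
            merged.append(cols_p[i - 1] + cols_q[j - 1]); i -= 1; j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + s(cols_p[i - 1], gap_q):
            merged.append(cols_p[i - 1] + gap_q); i -= 1
        else:
            merged.append(gap_p + cols_q[j - 1]); j -= 1
    merged.reverse()
    return ["".join(row) for row in zip(*merged)]

def progressive_align(tree, seqs):
    # A leaf is a sequence index; an internal node is a (left, right) pair.
    if isinstance(tree, int):
        return [seqs[tree]]
    return align_profiles(progressive_align(tree[0], seqs),
                          progressive_align(tree[1], seqs))

def random_tree(leaves, rng):
    # Join two randomly chosen nodes until a single root remains.
    nodes = list(leaves)
    while len(nodes) > 1:
        a = nodes.pop(rng.randrange(len(nodes)))
        b = nodes.pop(rng.randrange(len(nodes)))
        nodes.append((a, b))
    return nodes[0]

def leaves_of(tree):
    return [tree] if isinstance(tree, int) else leaves_of(tree[0]) + leaves_of(tree[1])

def relabel(tree, mapping):
    if isinstance(tree, int):
        return mapping[tree]
    return (relabel(tree[0], mapping), relabel(tree[1], mapping))

def mutate(tree, rng):
    # Swap the positions of two leaves; the tree shape is unchanged.
    a, b = rng.sample(leaves_of(tree), 2)
    mapping = {x: x for x in leaves_of(tree)}
    mapping[a], mapping[b] = b, a
    return relabel(tree, mapping)

def evolve(seqs, pop_size=8, generations=10, seed=0):
    rng = random.Random(seed)
    fitness = lambda t: sum_of_pairs(progressive_align(t, seqs))
    pop = [random_tree(range(len(seqs)), rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]  # keep the better half
        pop = survivors + [mutate(rng.choice(survivors), rng) for _ in survivors]
    best = max(pop, key=fitness)
    return best, progressive_align(best, seqs)
```

Calling `evolve(["GATTACA", "GATACA", "GCTTACA", "GATTAC"])` returns the best guide tree found and the alignment it produces. Because the better half of each generation is kept, the best fitness never decreases, but a toy loop like this says nothing about how the actual method compares with CLUSTAL W.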

Similar resources

Multiple molecular sequence alignment by island parallel genetic algorithm

This paper presents an evolution-based approach for solving multiple molecular sequence alignment. The approach is based on the island parallel genetic algorithm that relies on the fitness distribution over the population of alignments. The algorithm searches for an alignment among the independent isolated evolving populations by optimizing a weighted sum-of-pairs objective function which measure...

Evolving Guide Trees in Progressive Multiple Sequence Alignment

We present a novel application of genetic algorithms to the problem of aligning multiple biological sequences through the optimization of guide trees. Individual guide trees are represented as coalescing binary trees which provide for efficient and meaningful crossover and mutation operations. We hypothesize that our technique avoids the limitations of other heuristic tree-building techniques, ...

SAGA: sequence alignment by genetic algorithm.

We describe a new approach to multiple sequence alignment using genetic algorithms and an associated software package called SAGA. The method involves evolving a population of alignments in a quasi evolutionary manner and gradually improving the fitness of the population as measured by an objective function which measures multiple alignment quality. SAGA uses an automatic scheduling scheme to c...

Two Phase Evolutionary Method for Multiple Sequence Alignments

This paper presents a new evolutionary method, namely, a Two Phase evolutionary algorithm for multiple sequence alignments. This method is composed of different types of evolutionary algorithms, that is, an evolutionary progressive multiple sequence alignment method (abbreviated to ET) and Sequence Alignment by Genetic Algorithm (abbreviated to SAGA). The former is employed to obtain efficientl...

SeqFIRE: a web application for automated extraction of indel regions and conserved blocks from protein multiple sequence alignments

Analyses of multiple sequence alignments generally focus on well-defined conserved sequence blocks, while the rest of the alignment is largely ignored or discarded. This is especially true in phylogenomics, where large multigene datasets are produced through automated pipelines. However, some of the most powerful phylogenetic markers have been found in the variable length regions of multiple al...

Publication date: 2004